Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speedups: remove dependency on c++ #2796

Merged
merged 19 commits into from
Jun 12, 2024

Conversation

black-sliver
Copy link
Member

What is this fixing or adding?

Removes dependency on C++ for _speedups.

But Why? C++ adds extra dependencies that need to be loaded into the process and afaik only _speedup requires them at the moment. Also currently we also don't handle C++ exceptions that could occur. This change fixes both and saves some memory because it should always have less memory overhead.

How was this tested?

Only pytest and a few benchmarks and tests on C level. I wanted to push this so we can discuss.

@ScipioWright ScipioWright added is: enhancement Issues requesting new features or pull requests implementing new features. affects: core Issues/PRs that touch core and may need additional validation. labels Feb 5, 2024
_speedups.pyx Outdated Show resolved Hide resolved
@black-sliver
Copy link
Member Author

Here are some numbers for the case where len() is way bigger than 1023 from my machine:

filling int32Set(50000) with 50000 out of 100000 took 896.549 us
filling int32Set(1023) with 50000 out of 100000 took 1381.6 us
filling std::set with 50000 out of 100000 took 6166.67 us 
testing 10000000 in int32Set(50000) took 92852.9 us
testing 10000000 in int32Set(1023) took 194879 us
testing 10000000 in std::set took 1.08036e+06 us

I'm not sure why std::set is that much slower to be honest, but that's just there for reference.

So, the 1023 set is "half as fast", but it uses only ~230KB RAM compared to ~800KB for the "full" one.

intset.h Outdated Show resolved Hide resolved
@black-sliver
Copy link
Member Author

I played around with gcc -fanalyzer and it seems like gcc does not realize that memset(set, 0, size); sets all buckets' counts to 0. It enters the realloc code path immediately and claims to be able to enter the malloc code path after that, so I am a bit confused :D
If i instead run memset() in a loop on individual buckets, that problem disappears (so it's probably not about size_t overflow).

@agilbert1412 agilbert1412 added the waiting-on: peer-review Issue/PR has not been reviewed by enough people yet. label Feb 13, 2024
@black-sliver
Copy link
Member Author

I ran ubsan on my local test code and nothing popped up. A thing to note is that the struct with the variable sized array is not valid C++, but it is valid C and supported by extensions to C++ in both GCC and clang (and probably msvc, but I haven't checked). I am still confused about the memset not working for -fanalyzer. As far as I am aware, it is valid to cast to bytes and back as long as it's trivially copyable. (If I overlooked something, I'm hopeful that the tests crash before prod crashes, but technically I could also write a test in C (and C++), install latest clang via gh actions and see if it behaves correctly, if that makes anyone sleep better.)

Either way, I'd still like @beauxq to give feedback on the changed realloc before hitting merge.

@beauxq
Copy link
Collaborator

beauxq commented Feb 18, 2024

yeah, realloc change looks good to me 👍

INTxx_MIN % 1 would not produce a negative number, so the test was flawed.
Copy link
Member

@Berserker66 Berserker66 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping to find time to really dig into this one, but I think RC testing and unittests might have to do. Code looks fine at a glance.

@black-sliver black-sliver merged commit acf85eb into ArchipelagoMW:main Jun 12, 2024
20 checks passed
agilbert1412 pushed a commit to agilbert1412/Archipelago that referenced this pull request Jun 13, 2024
* Speedups: remove dependency on c++

* Speedups: intset: handle malloc failing

* Speedups: intset: fix corner case for int64 on 32bit systems

original idea was to only use bucket->val if int<pointer,
but we always have a union now anyway

* Speedups: add size comment to player_set bucket configuration

* test: more tests for LocationStore.find_item

* test: require _speedups in CI

This kind of tests that the build succeeds.

* test: even more tests for LocationStore.find_item

* Speedups: intset uniform comment style

* Speedups: intset: avoid memory leak when realloc fails

* Speedups: intset: make `gcc -pedantic -std=c99 -fanalyzer` without warnings

Unnamed unions are not in C99, this got fixed.
The overhead of setting count=0 is minimal or optimized-out and silences -fanalizer (see comment).

* Speedups: don't leak memory in case of exception

* Speedups: intset: validate alloc and free

This won't happen in our cython, but it's still a good addition.

* CI: add test framework for C/C++ code

* CI: ctest: fix cwd

* Speedups: intset: ignore msvc warning

* Tests: intset: revert attempt at no-asan

We solve this with env vars in ctest now, and this fails for msvc.

* Test: cpp: docs: fix typo

* Test: cpp: docs: fix another typo

* Test: intset: proper bucket count for Negative test

INTxx_MIN % 1 would not produce a negative number, so the test was flawed.
jnschurig pushed a commit to Tranquilite0/Archipelago-SoulBlazer that referenced this pull request Jun 13, 2024
* Speedups: remove dependency on c++

* Speedups: intset: handle malloc failing

* Speedups: intset: fix corner case for int64 on 32bit systems

original idea was to only use bucket->val if int<pointer,
but we always have a union now anyway

* Speedups: add size comment to player_set bucket configuration

* test: more tests for LocationStore.find_item

* test: require _speedups in CI

This kind of tests that the build succeeds.

* test: even more tests for LocationStore.find_item

* Speedups: intset uniform comment style

* Speedups: intset: avoid memory leak when realloc fails

* Speedups: intset: make `gcc -pedantic -std=c99 -fanalyzer` without warnings

Unnamed unions are not in C99, this got fixed.
The overhead of setting count=0 is minimal or optimized-out and silences -fanalizer (see comment).

* Speedups: don't leak memory in case of exception

* Speedups: intset: validate alloc and free

This won't happen in our cython, but it's still a good addition.

* CI: add test framework for C/C++ code

* CI: ctest: fix cwd

* Speedups: intset: ignore msvc warning

* Tests: intset: revert attempt at no-asan

We solve this with env vars in ctest now, and this fails for msvc.

* Test: cpp: docs: fix typo

* Test: cpp: docs: fix another typo

* Test: intset: proper bucket count for Negative test

INTxx_MIN % 1 would not produce a negative number, so the test was flawed.
sflavelle pushed a commit to sflavelle/Archipelago-tgc that referenced this pull request Jun 20, 2024
* Speedups: remove dependency on c++

* Speedups: intset: handle malloc failing

* Speedups: intset: fix corner case for int64 on 32bit systems

original idea was to only use bucket->val if int<pointer,
but we always have a union now anyway

* Speedups: add size comment to player_set bucket configuration

* test: more tests for LocationStore.find_item

* test: require _speedups in CI

This kind of tests that the build succeeds.

* test: even more tests for LocationStore.find_item

* Speedups: intset uniform comment style

* Speedups: intset: avoid memory leak when realloc fails

* Speedups: intset: make `gcc -pedantic -std=c99 -fanalyzer` without warnings

Unnamed unions are not in C99, this got fixed.
The overhead of setting count=0 is minimal or optimized-out and silences -fanalizer (see comment).

* Speedups: don't leak memory in case of exception

* Speedups: intset: validate alloc and free

This won't happen in our cython, but it's still a good addition.

* CI: add test framework for C/C++ code

* CI: ctest: fix cwd

* Speedups: intset: ignore msvc warning

* Tests: intset: revert attempt at no-asan

We solve this with env vars in ctest now, and this fails for msvc.

* Test: cpp: docs: fix typo

* Test: cpp: docs: fix another typo

* Test: intset: proper bucket count for Negative test

INTxx_MIN % 1 would not produce a negative number, so the test was flawed.
qwint pushed a commit to qwint/Archipelago that referenced this pull request Jun 24, 2024
* Speedups: remove dependency on c++

* Speedups: intset: handle malloc failing

* Speedups: intset: fix corner case for int64 on 32bit systems

original idea was to only use bucket->val if int<pointer,
but we always have a union now anyway

* Speedups: add size comment to player_set bucket configuration

* test: more tests for LocationStore.find_item

* test: require _speedups in CI

This kind of tests that the build succeeds.

* test: even more tests for LocationStore.find_item

* Speedups: intset uniform comment style

* Speedups: intset: avoid memory leak when realloc fails

* Speedups: intset: make `gcc -pedantic -std=c99 -fanalyzer` without warnings

Unnamed unions are not in C99, this got fixed.
The overhead of setting count=0 is minimal or optimized-out and silences -fanalizer (see comment).

* Speedups: don't leak memory in case of exception

* Speedups: intset: validate alloc and free

This won't happen in our cython, but it's still a good addition.

* CI: add test framework for C/C++ code

* CI: ctest: fix cwd

* Speedups: intset: ignore msvc warning

* Tests: intset: revert attempt at no-asan

We solve this with env vars in ctest now, and this fails for msvc.

* Test: cpp: docs: fix typo

* Test: cpp: docs: fix another typo

* Test: intset: proper bucket count for Negative test

INTxx_MIN % 1 would not produce a negative number, so the test was flawed.
@black-sliver black-sliver deleted the speedups-no-cpp branch September 18, 2024 07:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects: core Issues/PRs that touch core and may need additional validation. is: enhancement Issues requesting new features or pull requests implementing new features. waiting-on: peer-review Issue/PR has not been reviewed by enough people yet.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants